Wavelet-based relative prefix sum methods for range sum queries in data cubes
Author
Abstract
Data mining and related applications often rely on extensive range sum queries, so it is important for these queries to scale well. Range sum queries in data cubes can be answered in $O(1)$ time using prefix sum aggregates, but prefix sum update costs are proportional to the size of the data cube, $O(n^d)$. Using the Relative Prefix Sum (RPS) method, the update costs can be reduced to the square root of the size of the data cube, $O(n^{d/2})$. We present a new family of base-$b$ wavelet algorithms that further reduce the update costs to $O(n^{d/\beta})$ for $\beta$ as large as we want while preserving constant-time queries. We also show that this approach leads to $O(\log^d n)$ query and update methods twice as fast as Haar-based methods. Moreover, since these new methods are pyramidal, they provide incrementally improving estimates.
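The trade-off described in the abstract can be illustrated for the two-dimensional case (d = 2). The following is a minimal sketch of the plain prefix sum approach only, not of the RPS or wavelet methods; the function names `build_prefix_sums`, `range_sum`, and `update` are hypothetical and chosen for illustration.

```python
def build_prefix_sums(cube):
    """Build a 2-D prefix sum table P with one extra zero row and column,
    so that P[i][j] is the sum of cube[0..i-1][0..j-1]."""
    rows, cols = len(cube), len(cube[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            P[i + 1][j + 1] = cube[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]
    return P


def range_sum(P, r0, c0, r1, c1):
    """Sum of the cells in rows r0..r1 and columns c0..c1 (inclusive),
    using four table lookups regardless of the range size."""
    return P[r1 + 1][c1 + 1] - P[r0][c1 + 1] - P[r1 + 1][c0] + P[r0][c0]


def update(P, r, c, delta):
    """Add delta to cell (r, c). Every prefix sum that covers the cell
    must change; the number of modified entries is returned."""
    touched = 0
    for i in range(r + 1, len(P)):
        for j in range(c + 1, len(P[0])):
            P[i][j] += delta
            touched += 1
    return touched
```

In this sketch a query always costs four lookups, whereas updating the corner cell (0, 0) of an n-by-n cube rewrites all n^2 prefix sums, which is the update cost that the RPS and wavelet-based methods aim to reduce.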
Similar resources
Wavelet-Based Relative Prefix Sum Methods for Range Sum Queries in Data
Data mining and related applications often rely on extensive range sum queries and thus, it is important for these queries to scale well. Range sum queries in data cubes can be achieved in time $O(1)$ using prefix sum aggregates but prefix sum update costs are proportional to the size of the data cube $O(n^d)$. Using the Relative Prefix Sum (RPS) method, the update costs can be reduced to the ro...
Relative Prefix Sums: An Efficient Approach for Querying Dynamic OLAP Data Cubes
Range sum queries on data cubes are a powerful tool for analysis. A range sum query applies an aggregation operation (e.g., SUM) over all selected cells in a data cube, where the selection is specified by providing ranges of values for numeric dimensions. Many application domains require that information provided by analysis tools be current or "near-current." Existing techniques for range sum ...
On the Optimality of the Greedy Heuristic in Wavelet Synopses for Range Queries
In recent years, wavelet-based synopses were shown to be effective for approximate queries in database systems. The simplest wavelet synopses are constructed by computing the Haar transform over a vector consisting of either the raw data or the prefix sums of the data, and using a greedy heuristic to select the wavelet coefficients that are kept in the synopsis. The greedy heuristic is known to ...
Data Cubes in Dynamic Environments
The data cube, also known in the OLAP community as the multidimensional database, is designed to provide aggregate information that can be used to analyze the contents of databases and data warehouses. Previous research mainly focussed on strategies for supporting queries, assuming that updates do not play an important role and can be propagated to the data cube in batches. While this might be ...